Fix YAML scientific notation parsing as float #2913

koxudaxi · 2026-01-03T12:55:06Z

Fixes: #1955

Summary by CodeRabbit

New Features
- Added support for scientific notation in YAML files without decimal points (e.g., 1e-5, 1E+10)
- YAML parser now correctly interprets scientific notation as float values across positive and negative exponents


coderabbitai · 2026-01-03T12:55:16Z

📝 Walkthrough

Introduces YAML scientific notation pattern recognition and extends the CustomSafeLoader to properly resolve scientific notation values (e.g., 1e-5, 1E+10) as floats rather than strings during YAML parsing, fixing a bug where such numeric defaults were being cast to string types.

Changes

Cohort / File(s)	Summary
YAML Parser Enhancement `src/datamodel_code_generator/util.py`	Added `_YAML_SCIENTIFIC_NOTATION_PATTERN` regex to detect scientific notation without decimal points; extended `CustomSafeLoader.get_safe_loader()` to register implicit YAML resolvers for scientific notation as float type across sign and digit characters.
Test Fixture `tests/data/expected/main/yaml/scientific_notation.py`	New generated Pydantic model with four optional float fields featuring scientific notation defaults: `exponential_default` (1e-05), `positive_exp` (20000000000.0), `negative_prefix` (-30000.0), `with_decimal` (1.5e-05).
Test Coverage `tests/main/test_main_yaml.py`	Added `test_main_yaml_scientific_notation()` to validate YAML-to-Pydantic code generation with scientific notation input; verifies output matches expected model file.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A rabbit hops through YAML's den,
Where numbers danced in strange fashion then,
But now 1e-5 stays float, stays true,
No stringy tricks—science notation shines anew! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Fix YAML scientific notation parsing as float' directly matches the main change: introducing YAML scientific notation pattern recognition to parse such values as floats instead of strings.
Linked Issues check	✅ Passed	The PR successfully addresses issue #1955 by introducing _YAML_SCIENTIFIC_NOTATION_PATTERN and extending CustomSafeLoader to treat scientific notation as floats, with test coverage demonstrating the fix works for values like 1e-5, 20000000000, -30000, and 1.5e-05.
Out of Scope Changes check	✅ Passed	All changes directly support the objective of fixing YAML scientific notation parsing: the util.py modification adds the core fix, the test YAML file provides test data, the expected output file demonstrates correct parsing, and the test function validates the functionality.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c83470e and 14ca21d.

⛔ Files ignored due to path filters (1)

tests/data/yaml/scientific_notation.yaml is excluded by !tests/data/**/*.yaml and included by none

📒 Files selected for processing (3)

src/datamodel_code_generator/util.py
tests/data/expected/main/yaml/scientific_notation.py
tests/main/test_main_yaml.py

🧰 Additional context used

🧬 Code graph analysis (1)

tests/main/test_main_yaml.py (1)
- tests/main/conftest.py (2): output_file (99-101), run_main_and_assert (245-409)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: Analyze (python)
GitHub Check: py312-isort7 on Ubuntu
GitHub Check: 3.10 on Ubuntu
GitHub Check: py312-isort6 on Ubuntu
GitHub Check: 3.14 on Windows
GitHub Check: 3.10 on Windows
GitHub Check: 3.13 on Windows
GitHub Check: 3.12 on Windows
GitHub Check: 3.11 on Windows
GitHub Check: 3.14 on Ubuntu
GitHub Check: benchmarks

🔇 Additional comments (4)

src/datamodel_code_generator/util.py (2)

62-64: Excellent pattern definition for scientific notation.

The regex pattern correctly matches scientific notation without decimal points (e.g., 1e-5, 1E+10), addressing the core issue where such values were being parsed as strings. The pattern properly handles:

- Optional leading signs (-, +)
- Mantissa with optional underscores for readability (a digit-separator convention from YAML 1.1; the YAML 1.2 core schema does not define it)
- Case-insensitive exponent marker (e, E)
- Signed exponents

The explanatory comments clearly document the purpose and the gap in standard YAML parsing.
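As a rough illustration of the four properties listed above, here is a minimal sketch of such a pattern. The regex and the names `SCI_NOTATION` and `is_dotless_scientific` are assumptions made for this example; they are not copied from `_YAML_SCIENTIFIC_NOTATION_PATTERN` in the PR, which may differ in detail.

```python
import re

# Assumed pattern for illustration only (not the PR's actual regex):
# optional sign, digits with optional underscores, e/E, optionally signed exponent.
SCI_NOTATION = re.compile(r"[-+]?[0-9][0-9_]*[eE][-+]?[0-9]+")


def is_dotless_scientific(text: str) -> bool:
    """Return True when the whole scalar is scientific notation without a decimal point."""
    return SCI_NOTATION.fullmatch(text) is not None


for scalar in ["1e-5", "1E+10", "-3e4", "1_000e3", "1.5e-5", "1e", "abc"]:
    print(f"{scalar!r}: {is_dotless_scientific(scalar)}")
```

Values that already contain a decimal point, such as `1.5e-5`, are deliberately not matched here, since the comment above describes the pattern as covering only the no-decimal-point case.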

110-116: Scientific notation pattern registration is correct.

The pattern is registered for every character a matching scalar can start with (-, +, and the digits 0-9) and is tagged as a YAML float. As a result, scientific notation without a decimal point (like 1e-5) resolves to a float value. This addresses the issue where PyYAML's default resolver, which follows the YAML 1.1 float rules and requires a decimal point, treats such scalars as strings.
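The registration described above can be sketched with PyYAML's `add_implicit_resolver`. This is a minimal sketch, assuming PyYAML is installed; the loader name `SciNotationSafeLoader` and the regex are hypothetical and stand in for the project's `CustomSafeLoader` and `_YAML_SCIENTIFIC_NOTATION_PATTERN`, whose exact code is not shown in this thread.

```python
import re

import yaml


class SciNotationSafeLoader(yaml.SafeLoader):
    """Hypothetical loader; subclassing keeps the change off the shared SafeLoader."""


# Assumed regex for illustration; the PR's actual pattern may differ.
_SCI_NOTATION = re.compile(r"^[-+]?[0-9][0-9_]*[eE][-+]?[0-9]+$")

# Register the float tag for every character a matching scalar can start with.
SciNotationSafeLoader.add_implicit_resolver(
    "tag:yaml.org,2002:float",
    _SCI_NOTATION,
    list("-+0123456789"),
)

document = "tiny: 1e-5\nbig: 2E+10\nnegative: -3e4\n"

default_result = yaml.safe_load(document)
patched_result = yaml.load(document, Loader=SciNotationSafeLoader)

print(default_result)  # scalars without a decimal point stay strings
print(patched_result)  # the same scalars resolve to floats
```

Registering on a subclass rather than on `yaml.SafeLoader` itself keeps the behavior change local to callers that opt in to the custom loader.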

tests/main/test_main_yaml.py (1)

78-91: Well-structured test for scientific notation handling.

This test effectively verifies the fix for issue #1955. The test:

- Uses jsonschema input type, appropriate for testing default value parsing
- Explicitly specifies pydantic_v2.BaseModel output to ensure consistent behavior
- Includes a clear docstring explaining the expected behavior (scientific notation parsed as float, not string)
- Follows the established testing pattern in this file

tests/data/expected/main/yaml/scientific_notation.py (1)

10-14: Expected output correctly demonstrates the fix.

This expected output validates that scientific notation values are now parsed as float literals rather than strings:

- `exponential_default: float | None = 1e-05`: scientific notation preserved as float literal ✓
- `positive_exp: float | None = 20000000000.0`: large value in decimal form (Python's repr choice) ✓
- `negative_prefix: float | None = -30000.0`: negative value in decimal form ✓
- `with_decimal: float | None = 1.5e-05`: scientific notation with decimal point preserved ✓

Before the fix, these would have been generated as strings like '1e-05'. The fix ensures they remain numeric literals, which is the correct behavior per issue #1955.
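The mix of exponent and decimal forms in the defaults above follows from how Python's `repr` renders floats. A small sketch, assuming input scalars of the form `1e-5`, `2e10`, `-3e4`, and `1.5e-5` (the PR's input YAML file is excluded from this review, so these inputs are hypothetical):

```python
# Hypothetical input scalars; the actual test YAML is not shown in this thread.
samples = ["1e-5", "2e10", "-3e4", "1.5e-5"]

# repr() is what ends up in generated source when a float default is written out.
rendered = [repr(float(text)) for text in samples]

for text, literal in zip(samples, rendered):
    print(f"{text} -> {literal}")
# 1e-5 -> 1e-05
# 2e10 -> 20000000000.0
# -3e4 -> -30000.0
# 1.5e-5 -> 1.5e-05
```

Python keeps exponent notation for very small magnitudes and switches to plain decimal form for moderately large ones, which is why `positive_exp` appears as `20000000000.0`.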

github-actions · 2026-01-03T12:56:10Z

📚 Docs Preview: https://pr-2913.datamodel-code-generator.pages.dev

codspeed-hq · 2026-01-03T12:58:15Z

CodSpeed Performance Report

Merging #2913 will degrade performance by 17.71%

_{Comparing fix/yaml-scientific-notation-1955 (14ca21d) with main (c83470e)}

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

❌ 11 regressions
⏩ 98 skipped¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
❌	WallTime	`test_perf_graphql_style_pydantic_v2`	695.7 ms	829.1 ms	-16.09%
❌	WallTime	`test_perf_duplicate_names`	839.6 ms	1,020.3 ms	-17.71%
❌	WallTime	`test_perf_aws_style_openapi_pydantic_v2`	1.6 s	2 s	-15.53%
❌	WallTime	`test_perf_openapi_large`	2.5 s	2.9 s	-15.29%
❌	WallTime	`test_perf_all_options_enabled`	5.7 s	6.8 s	-15.21%
❌	WallTime	`test_perf_kubernetes_style_pydantic_v2`	2.2 s	2.6 s	-15.95%
❌	WallTime	`test_perf_stripe_style_pydantic_v2`	1.7 s	2 s	-15.63%
❌	WallTime	`test_perf_multiple_files_input`	3.1 s	3.8 s	-17.56%
❌	WallTime	`test_perf_deep_nested`	5.1 s	6.1 s	-15.71%
❌	WallTime	`test_perf_large_models_pydantic_v2`	3.1 s	3.7 s	-16.98%
❌	WallTime	`test_perf_complex_refs`	1.7 s	2 s	-16.78%

¹ 98 benchmarks were skipped, so the baseline results were used instead.
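The Efficiency column in the table above appears consistent with the relative change (BASE − HEAD) / HEAD, expressed as a percentage. This is an inference from the reported numbers, not a documented CodSpeed formula; the function name `efficiency_percent` is made up for this sketch. Only the rows reported in milliseconds are checked, because the rows in seconds are rounded too coarsely to reproduce.

```python
def efficiency_percent(base_ms: float, head_ms: float) -> float:
    """Relative change of BASE versus HEAD, in percent (assumed formula)."""
    return round((base_ms - head_ms) / head_ms * 100, 2)


# Timings copied from the two rows reported in milliseconds.
graphql_style = efficiency_percent(695.7, 829.1)
duplicate_names = efficiency_percent(839.6, 1020.3)

print(graphql_style)    # -16.09
print(duplicate_names)  # -17.71
```

Because the report itself warns that wall-time measurements on standard hosted runners are inconsistent, a uniform slowdown of about 15 to 18 percent across unrelated benchmarks may reflect runner noise rather than a real regression.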

codecov · 2026-01-03T12:59:06Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.40%. Comparing base (a310b6f) to head (14ca21d).
⚠️ Report is 12 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2913      +/-   ##
==========================================
+ Coverage   99.38%   99.40%   +0.02%     
==========================================
  Files          92       95       +3     
  Lines       16342    16910     +568     
  Branches     1934     1991      +57     
==========================================
+ Hits        16241    16809     +568     
  Misses         52       52              
  Partials       49       49
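The percentages in the diff above can be reproduced from the Hits and Lines rows, assuming coverage is computed as hits divided by total lines (where lines = hits + misses + partials). The helper name `coverage_percent` is hypothetical.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines (assumed formula)."""
    return round(100 * hits / lines, 2)


base = coverage_percent(16241, 16342)  # main
head = coverage_percent(16809, 16910)  # this PR
delta = round(head - base, 2)

# Sanity check: hits + misses + partials add up to the line totals.
assert 16241 + 52 + 49 == 16342
assert 16809 + 52 + 49 == 16910

print(base, head, delta)  # 99.38 99.4 0.02
```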

Flag	Coverage Δ
unittests	`99.40% <100.00%> (+0.02%)`	⬆️

github-actions · 2026-01-03T15:44:04Z

Breaking Change Analysis

Result: No breaking changes detected

Reasoning: This PR is a bug fix that corrects incorrect YAML parsing behavior. Previously, scientific notation without decimal points (e.g., 1e-5, 1E+10) was parsed as strings instead of floats, because the YAML 1.1 float rules that PyYAML follows require a decimal point. The fix makes the YAML parser recognize these values as floats, which matches the YAML 1.2 core schema. Generated code will change for users whose YAML files contain such scientific notation defaults, but the change is from incorrect behavior to correct behavior, so it is not a breaking change in the conventional sense. No API, CLI, templates, defaults, or Python version support were modified.

This analysis was performed by Claude Code Action

github-actions · 2026-01-03T17:52:56Z

🎉 Released in 0.52.1

This PR is now available in the latest release. See the release notes for details.

Fix YAML scientific notation parsing as float

14ca21d

koxudaxi merged commit b5a361d into main Jan 3, 2026
37 of 38 checks passed

koxudaxi deleted the fix/yaml-scientific-notation-1955 branch January 3, 2026 15:42

github-actions bot added the breaking-change-analyzed label Jan 3, 2026

github-actions bot mentioned this pull request Jan 3, 2026

Casts default values of type number (scientific notation) to str #1955

Closed
